ABSTRACT


Sixteen English vowels and diphthongs were recorded by four
male and four female speakers. For each vowel and diphthong, 120
responses were obtained from a panel of six phonetically trained
listeners at several levels of noise and quiet; each of five levels of noise
was matched for listening difficulty with one of five levels of quiet. The
levels of difficulty ranged from approximately 25 per cent to 85 per
cent correct.

The results on vowel-diphthong intelligibility support the
conclusions that vowels and diphthongs (1) are significantly different in
intelligibility; (2) have a fairly stable order of intelligibility, similar
in noise and quiet and among the eight speakers, though more stable
among speakers of the same sex; (3) improve in intelligibility at
different rates as listening conditions are improved; and (4) are more
intelligible from male speakers.

In regard to vowel and diphthong confusability, results
support the conclusions that (1) a great many significant confusions exist
among the vowels and diphthongs under fairly difficult listening
conditions, (2) at least one significant confusion exists for each vowel
and diphthong, (3) each vowel and diphthong is a significant confusion
for at least one other vowel or diphthong, and (4) confusions bear a
reciprocal relationship to one another.
INTRODUCTION


Considerable research has been directed toward determining the intelligibility of the
consonantal sounds of English. Very little has been done on the intelligibility of the English
vowels, particularly at low intensities and relatively difficult signal-to-noise ratios.

Knowledge of the intelligibility of the vowels at such levels should permit more
effective use of the English language in the selection or formation of words for use in
communication. Speech via telephone or radio, and the development of an International Language
for Aviation Communication, are particular areas in which accurate transmission under less
than perfect listening conditions is needed.

Further, the study of vowels is important in that they are the loudest parts of words.
At low intensities or when consonants are masked by noise, the expectation is that the vowel
or vowels might still be heard. Correct identification of the word might still occur.
Consequently, the use of a word containing an intelligible vowel should be more satisfactory for
communication than the use of the same consonantal structure with a less intelligible vowel.
In this way, a possible "building block" approach might be employed in the construction of
a vocabulary for use when optimum listening conditions are not assured.

II  REVIEW  OF  PREVIOUS  RESEARCH 

Difficulties in the auditory perception of an individual word often are related to
faintness or indistinctness of the component speech sounds. Faintness and indistinctness,
in turn, are associated with low intensities of the signal, the presence of masking noise,
or high confusability between sounds. Regarding the vowels, studies by Sacia and Beck;
Black; Fairbanks, House, and Stevens; and Curry have been concerned with determining
the range of relative intensities of the vowels in decibels. The results of these studies are
not entirely consistent. For example, the study by Black correlates with the study by Sacia
and Beck with a rank order correlation of .47, while the study by Fairbanks, House, and
Stevens correlates with the Curry study with a rank order correlation of .87. The latter
work was additionally concerned with the threshold identification of the vowels, and Curry
found that vowels having greater intensity were not always the most easily identified. He
concluded that intensity was not the only factor involved in vowel identification.

Siegenthaler, in a study of sustained vowels at supra-threshold levels, reported
that many of the sustained vowels tended to sound like [a] whenever the initiations and
conclusions of the vowels were removed. Koch states that sustained vowels should not be
used in communication because such steady sounds tend to lose their identification. Moser,
Dreher, and O'Neill investigated the masking of English monosyllabic words by prolonged
vowel sounds. Prolonged vowel sounds were found to differ greatly in masking speech;
vowels with concentration of energy in the 700-1000 cps range were the most effective
masking agents; a monosyllable containing the same vowel as that employed as a masking
agent was not more likely than others to be blotted out; monosyllables containing [o] and
[o] were most affected, those with [a] the least; the rank order of vowel-masking
effectiveness was [e], [k], [o], [u], [i], [a], [i], [o], and [u].
A number of studies have considered formant structure and position responsible for vowel
identification. Tiffany found a high correlation between formant position and threshold
identification of vowels. Lehiste and Peterson, in a study of filtered vowels, found that only the
first three formants were important, and that different vowels depended on different formants and
the interaction of these formants for identification. Peterson collaborated with Barney on
methods of vowel study to determine the relative intelligibility of the vowels. Lists contained
10 monosyllabic words, each beginning with [h] and ending with [d] and differing in the vowel.
These were presented to listeners at a supra-liminal level of 70 db (re 0.0002 dyne/cm²). They
report: "Certain of the vowels ([i], [ɝ], [æ], and [u]) are generally better understood than
others, possibly because they represent limit positions of the articulatory mechanism." They
also found that, when observers confused one vowel with another, the two vowels nearly always
had adjacent positions on the vowel loop; i.e., [ɪ] was taken for [ɛ] and [ɛ] was called
either [ɪ] or [æ]. Edmonson and Horwitz reported that vowel confusions are the result of
the overlap of the first two formants, although for the most part vowels are recognized a high
percentage of the time. Miller used a 670-cycle low-pass filter to eliminate the second
formant of 16 vowels and diphthongs presented in a consonantal [h-d] context. Confusions
showed very closely the pattern that would be predicted from formant analysis. Since the filter
would not affect temporal characteristics, Miller offers as the simplest explanation of the
discrepancies the deduction that duration is an important feature. He concludes that there are at
least three distinctive features for simple vowels: duration, frequency of the first formant, and
frequency of the second formant. Pickett studied the effects of various noise spectra on the
intelligibility of vowels and found that all vowels are not affected in the same way by the same
noise spectra. Significant shifts in vowel confusions occurred with changes in noise spectra, and
these changes were consistent with the formant theory. The intelligibility of a sound under such
conditions could be determined by its relative intensity or its characteristic phonetic nature.

III  PURPOSES  OF  THIS  STUDY 

The purposes of this study were to determine: (1) the differences in intelligibility of
isolated vowels and diphthongs at low intensities and relatively difficult signal-to-noise ratios;
(2) the similarity of vowel and diphthong rank order intelligibility in quiet and noise; (3) the
stability of vowel and diphthong rank order intelligibility as the intensity of the stimuli is
successively increased; (4) the consistency of vowel and diphthong rank order intelligibility
among various speakers; (5) the similarity of intelligibility values of the vowels and diphthongs
between male and female speakers; (6) the principal confusions between the vowels and
diphthongs under conditions of quiet and noise; and (7) the similarity of confusability values among
related pairs of errors.

IV  METHOD 

Stimuli and Recording Procedure: Randomized lists of isolated vowels and diphthongs
were recorded by eight speakers. Each speaker recorded a different randomization of 80
stimuli; each randomization contained five presentations of each vowel and diphthong. The
16 common vowels and diphthongs used were: [i], [ɪ], [e], [ɛ], [æ], [ɑ], [ɔ], [ʌ],
[ɝ], [o], [ʊ], [u], [aɪ], [aʊ], [ɔɪ], and [ɪu].* Segments of a typical recording
follow:

"This is Speaker Number Five,
Number 1, write [a],
Number 2, write [i],
Number 3, write [aɪ],
. . .
Number 80, write [ɛ]."

*For those unfamiliar with phonetic symbols, the above vowels and diphthongs are identified by
the vowel sounds in the following common words: HE, HIT, HAY, HECK, HAT, HOT,
HAWK, HUT, HER, HOE, HOOK, WHO, HIGH, HOW, HOIST, and HUE.

In addition, an orientation and training tape was prepared; this provided an opportunity
for listeners to become familiar with the voices of the eight speakers and the stimuli to be
identified. It also provided the materials for a listener training program, and the means of
determining the average auditory detection threshold of the listeners.

Recording was done with a tape recorder (Ampex 600), using a condenser microphone
(Altec 21-B) positioned at the corner of the mouth, lightly touching the cheek. The original
recordings were played through a laboratory signal-to-noise equalizer and re-recorded. In
this operation the word "write" of every carrier phrase was equated to within ±1 db. However,
the relationship of each stimulus to its introductory carrier phrase was maintained. For example,
if the "write" was raised one or two db in intensity, the stimulus vowel was raised by the same
amount. After all of the carrier phrases were equated, each was increased 10 db to ensure that
the carrier words would be clear enough to prepare the listeners for the stimuli and enable them
to write their responses in the appropriate spaces on the answer forms.

Speakers: Four males and four females of General American dialect were selected and
trained as speakers. All were familiar with phonetics and were experienced as laboratory talkers.
Detailed instructions were given to keep the carrier word and the stimulus as near to equal
intensity as possible, and to make the stimuli equal in duration. Practice was given via speaking into
a microphone connected to a tape recorder equipped with a VU (volume unit) meter. In this way,
the speakers were able to monitor the level of the carrier word and to establish a more
homogeneous pattern for use in the actual recording, during which the VU meter was not used. Therefore,
each speaker used his own natural feedback mechanism to maintain his vowels at their own natural
level rather than at a constant energy level. It was assumed that, by keeping the vocal level
constant, the inherent intelligibility of each vowel and diphthong would be maintained.

Listeners: Six research assistants, three male and three female, of the Psycholinguistics
Laboratory at The Ohio State University served as listeners. All were trained listeners, had
normal hearing, and had had at least one course in phonetics.

Dependent and Independent Variables: In accordance with the purposes of this study,
two dependent variables (or properties) of vowels were of interest. These were intelligibility
and confusability. Intelligibility was operationally defined as the percentage of correct
identifications in the total number of choices made. Confusability was defined as the percentage
of the total incorrect identifications in which a particular vowel or diphthong is substituted for
the stimulus actually spoken; omissions were excluded from the computations.
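The two operational definitions can be illustrated with a minimal Python sketch. The tally below is hypothetical and is not data from this study; the function names are illustrative, and the sketch assumes that omissions are excluded from both measures, which is one reading of the definitions above.

```python
from collections import Counter

def intelligibility(responses, stimulus):
    """Per cent of the choices made (omissions excluded) that match the stimulus."""
    made = [r for r in responses if r is not None]  # None marks an omission
    if not made:
        return 0.0
    correct = sum(1 for r in made if r == stimulus)
    return 100.0 * correct / len(made)

def confusability(responses, stimulus):
    """Per cent of the incorrect identifications accounted for by each substituted sound."""
    errors = [r for r in responses if r is not None and r != stimulus]
    if not errors:
        return {}
    counts = Counter(errors)
    return {sound: 100.0 * n / len(errors) for sound, n in counts.items()}

# Hypothetical tally for one stimulus, written with key words instead of symbols.
responses = ["HIT"] * 6 + ["HECK"] * 3 + ["HE"] * 1 + [None] * 2
print(intelligibility(responses, "HIT"))
print(confusability(responses, "HIT"))
```

In this made-up tally, six of the ten choices made are correct, and three of the four errors are substitutions of the HECK vowel.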

The purposes of the study additionally called for the systematic variation of several
independent variables. These involved differences in stimuli presented (16 vowels and
diphthongs), differences in the intensity of stimulus presentation and differences in signal-to-noise
ratios, differences in the sex of speakers, and differences in individuals as speakers.


The intensity of the stimulus presentation in quiet was varied to produce five levels. The
first level of presentation was 4 db above the average detection threshold of the listeners. At this
level a number of the stimuli were below threshold. In the original recording the vowels and
diphthongs were produced at a constant vocal level and recorded on this basis rather than adjusted to
a constant peak energy level; therefore, some of the stimuli may have had more energy than others,
even to the point of being supra-threshold at the lowest level of presentation. The second level
of presentation was 8 db above the average detection threshold; the third level, 12 db; the fourth
level, 16 db; and the fifth level, 20 db.

A careful attempt was made to equate the conditions in noise with the conditions in
quiet with respect to listening comprehension. The rationale employed was that if the number
of correct answers to the training tape at level one in noise was approximately the same as at
level one in quiet, this approximation would continue for all pairs of levels. A level in noise
was selected on the basis of the previous experience of the laboratory staff to yield the same
number of correct answers as were given at the most difficult level in quiet. This chosen level was
a -10 db signal-to-noise ratio in reference to the stimuli. Succeeding signal-to-noise ratios
were -6, -2, +2, and +6 db in that order. When the data were tabulated (see Table I), it was
apparent that the percentages of correct responses in quiet and noise were quite satisfactorily
equalized at each of the five levels of listening difficulty. This equating of difficulty can also
be observed in Figure 1.

The signal-to-noise ratios were obtained in the following manner: the output of a flat
noise generator (Grason-Stadler, Model 455-B) was fed into the laboratory signal-to-noise
equalizer to cause a certain deflection of the needle on the microammeter. The noise was
filtered in the laboratory equalizer by a low-pass filter cutting off at 4500 cps with a slope
of -18 db per octave. The speech signals were also fed into the equalizer, where they were
adjusted by means of a vertical row of five indicator lights. When the intensity of the signal
was sufficient to cause the lower four lights to glow, the peak voltage of the signal was very
closely approximated to the rms voltage of the noise. Since all the carrier words had been
equated earlier, the speech and noise voltages were mixed through attenuators set for the desired
signal-to-noise ratio. After the signal-to-noise ratio was set relative to the carrier words, the
carrier-to-noise ratio was adjusted so as to be 10 db greater than the vowel-to-noise ratio. In
this way, the carrier words would be clear enough to enable the listeners to locate the appropriate
spaces on the answer forms. In the testing in noise, the noise was maintained at the same level
and the signal was adjusted in intensity to produce the various signal-to-noise ratios.
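The decibel relationships described above can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard voltage convention (db = 20 log10 of the voltage ratio), and the constant and function names are not taken from the report.

```python
# Vowel-to-noise ratios used in the five noise conditions (db), as given in the text.
VOWEL_TO_NOISE_DB = [-10, -6, -2, 2, 6]
CARRIER_OFFSET_DB = 10  # carrier words were set 10 db above the vowels

def db_to_voltage_ratio(db):
    """Convert a level difference in db to a voltage ratio (20*log10 convention)."""
    return 10 ** (db / 20)

for snr in VOWEL_TO_NOISE_DB:
    carrier_snr = snr + CARRIER_OFFSET_DB
    print(f"vowel S/N {snr:+d} db (voltage ratio {db_to_voltage_ratio(snr):.2f}), "
          f"carrier S/N {carrier_snr:+d} db (voltage ratio {db_to_voltage_ratio(carrier_snr):.2f})")
```

Under this convention the carrier words stay at or above the noise level in every condition, even when the vowels themselves are 10 db below it.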

Test Administration: Prior to the actual testing, the listeners were acquainted with the
16 different stimuli and the speaker voices they would hear, and practice sessions were
conducted to produce stable scores with the training tape. This same tape was also used to
determine the average auditory detection threshold of the listeners to the stimuli, in both ascending
and descending manner. The range of the mean detection thresholds for the individual listeners
was 2 db.

Listeners were seated in a prefabricated sound-treated room (Industrial Acoustics Company,
Model 403). The ambient noise level in the test chamber during testing was 30 db as measured
with a sound-level meter (H.H. Scott, Model 410-A, C scale). Listening was monaural, with the
stimuli delivered to the preferred ear of the listener. Listeners transcribed their responses on
specially prepared answer forms.
FIGURE 1

PERCENTAGES OF CORRECT RESPONSES TO VOWEL STIMULI IN QUIET AND NOISE
AT VARIOUS LEVELS OF LISTENING DIFFICULTY

Level of Listening Difficulty       1         2         3         4         5
Quiet (level above threshold)      4 db      8 db     12 db     16 db     20 db
  Per Cent Correct                 24.0      39.9      59.2      73.7      83.7
Noise (signal-to-noise ratio)    -10 db     -6 db     -2 db     +2 db     +6 db
  Per Cent Correct                 22.7      41.0      59.8      73.4      83.7


Tests in the quiet were administered through a tape recorder (Ampex 600) to six sets of
headphones (Telephonics TDH-39, mounted in MX-41/AR cushions). The output of the recorder
was fed to an attenuator (Hewlett-Packard), then into a mixing transformer (UTC, Model
CG-137), and finally into a listening circuit containing the six headphones. The accuracy of the
attenuation response was checked with a vacuum-tube voltmeter (Hewlett-Packard, Model 400-D).
Both the 10 db and the 1 db settings gave accurate attenuation settings. The same test list
recorded by a speaker was used at each of the five levels in succession, beginning with the most
difficult level, until all five listening conditions had been completed. For example, the listeners
heard one speaker reading a randomized list of 80 vowels with the level of the auditory stimuli
set 4 db above the average detection threshold of the listeners. Answer sheets were immediately
collected, the tape was rewound, and the listeners were given a short break to eliminate fatigue.
Next, the level was set 8 db above detection threshold. Again, the papers were collected, the
tape rewound, and this time the level was set at 12 db. This was repeated at the 16-db and the
20-db levels for a total of five levels.

Conditions for testing in noise were similar to those used for testing in quiet except that
the tests were conducted in the Psycholinguistics Laboratory instead of the I.A.C. room. The
ambient noise level in the laboratory during the testing was 42 db, this reading being taken
from a sound-level meter (H.H. Scott, Model 410-B, C scale). It was assumed that the masking
noise fed into the earphones was sufficient to mask the ambient noise in the room. When questioned,
the subjects reported that the only sound they heard was that coming to them through the earphones.
The administration of the test in noise was in other respects the same as the administration of the
test in quiet, with the most difficult level being presented first and the next four levels becoming
progressively easier.

In that "the same test was used at each of five levels in succession," the question could
be raised as to how much learning at one level was carried to the next. Two major factors reduce
the likelihood that the list was learned sufficiently to affect significantly the intelligibility values
at succeeding levels. First, each list contained 80 items, each item isolated and free of context
or meaning. Second, each list was presented at the most difficult level initially and at successively
easier levels. The listener would immediately discover at each successive level that the stimuli were
clearer. In such a circumstance, it was reasoned that he would find it psychologically more
economical to rely on listening to clearer stimuli rather than upon memory of those stimuli he had heard
under less favorable conditions while remembering as well their positions in a series of 80 unrelated
items.

V ANALYSIS OF DATA AND RESULTS

The  responses  of  the  listeners  were  tabulated  and  analyzed  in  a  variety  of  ways  in  order  to 
yield  answers  to  questions  pertinent  to  the  purposes  of  the  study.  This  section  of  the  report  is 
organized  in  terms  of  these  questions  and  the  results  relevant  to  each  question. 

1. Do the vowels and diphthongs differ significantly in intelligibility?

Listener responses to the vowel and diphthong stimuli were tabulated separately for each
speaker in each condition. The data on the male and female speakers, respectively, were
combined and are presented in Table I. Each stimulus (vowel or diphthong) elicited a total of 2400
responses; eight speakers presented each vowel or diphthong five times under ten conditions to six
trained listeners. The number and percentage of correct identifications of each stimulus were
tabulated and are presented in Table II.
TABLE I - CORRECT RESPONSES AT DIFFERENT LEVELS OF QUIET (Q) AND NOISE (N)

         LEVEL I      LEVEL II     LEVEL III    LEVEL IV     LEVEL V      TOTAL    TOTAL    TOTAL
         Q     N      Q     N      Q     N      Q     N      Q     N      Q        N        Q + N
Total    922   873    1533  1575   2275  2298   2828  2817   3212  3214   10770    10777    21547
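As a rough check on the matching of difficulty in quiet and noise, the totals in Table I can be converted to percentages. The sketch below assumes a denominator of 3840 responses per level and condition (16 stimuli, eight speakers, five presentations, six listeners), which follows from the design described above but is not stated as such in the report; small differences in the last digit from the published percentages are therefore possible, for instance if omissions were handled differently.

```python
# Totals of correct responses from Table I, levels I through V.
QUIET = [922, 1533, 2275, 2828, 3212]
NOISE = [873, 1575, 2298, 2817, 3214]

# Assumed responses per level and condition:
# 16 stimuli x 8 speakers x 5 presentations x 6 listeners.
RESPONSES_PER_CELL = 16 * 8 * 5 * 6

def per_cent(correct, total=RESPONSES_PER_CELL):
    """Percentage of correct responses, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)

for level, (q, n) in enumerate(zip(QUIET, NOISE), start=1):
    print(f"level {level}: quiet {per_cent(q)}  noise {per_cent(n)}")
```

At every level the quiet and noise percentages computed this way lie within about two points of each other.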


TABLE II

NUMBER AND PERCENTAGE OF CORRECT IDENTIFICATIONS FOR 2400
PRESENTATIONS OF EACH OF 16 VOWELS OR DIPHTHONGS

Vowel or          Correct Identifications
Diphthong         Number       Per Cent

e                 1743         72.6
ɪ                 1718         71.6
ɑ                 1716         71.5
e                 1686         70.3
aɪ                1580         65.8
ʌ                 1556         64.8
aʊ                1482         61.8
0                 1443         60.1
o                 1397         58.2
ɔɪ                1307         54.5
æ                 1245         51.9
i                 1232         51.3
ɝ                 1053         43.9
ʊ                 1047         43.6
u                  736         30.7
ɪu                 606         25.3


The hypothesis of no difference between the frequencies of correct identification was
tested by means of a Chi Squared One Sample Test. This yielded a value of 1327 with 15
degrees of freedom; a value of 37.70 is needed for .001 significance. It can be concluded that
the intelligibilities of the vowels and diphthongs, measured by per cent correct identification,
are very significantly different.
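The form of this test can be sketched from the counts in Table II. The sketch assumes that the expected frequency is the same for all 16 stimuli (the mean of the observed counts); the report does not give the details of the computation, so the statistic obtained this way need not match the reported value of 1327 exactly.

```python
# Correct identifications per stimulus, from Table II (2400 presentations each).
CORRECT = [1743, 1718, 1716, 1686, 1580, 1556, 1482, 1443,
           1397, 1307, 1245, 1232, 1053, 1047, 736, 606]

def chi_square_one_sample(observed):
    """Chi-square statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

statistic = chi_square_one_sample(CORRECT)
df = len(CORRECT) - 1
CRITICAL_001 = 37.70  # value needed at the .001 level with 15 df, as quoted in the text

print(f"chi-square = {statistic:.1f}, df = {df}")
print("exceeds .001 critical value:", statistic > CRITICAL_001)
```

Whatever the exact expected frequencies used, the statistic is far larger than the critical value, which is the point of the conclusion drawn above.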

Additionally, the data were analyzed to determine which of any two vowels or diphthongs
is significantly more intelligible under a variety of conditions. As previously described, these varied
conditions involved noise or quiet, differing signal-to-noise ratios or signal intensities, and sex
differences in speakers. There were 20 possible variations or combinations of these conditions;
listeners heard a male or female speaker, in noise or quiet, at each of five levels. Thus, for
each vowel or diphthong there were 20 intelligibility values obtained under these differing
conditions.

For any given pair of stimuli (vowel or diphthong) there were 20 related pairs of
intelligibility values. For example, [a] and [o] had 58 and 45 correct identifications when
heard at level 1 in quiet from male speakers (see Table I). The set of 20 such related values
was tabulated for each possible pair of stimuli; there were 120 such sets of related values.
Wilcoxon's Test for Matched Pairs was computed for each pair of vowels or diphthongs. The
resultant 120 T values, along with their significances, are presented in Table III.
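A minimal sketch of the matched-pairs computation follows. It uses the common textbook form of the statistic (zero differences dropped, tied absolute differences given average ranks, T taken as the smaller rank sum), which is assumed here rather than confirmed by the report. Apart from the first pair (58 and 45, quoted above), the counts are hypothetical, and only eight conditions are shown instead of twenty.

```python
from math import comb

def wilcoxon_t(x, y):
    """Wilcoxon matched-pairs signed-ranks T and the number of non-zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences are dropped
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):                            # average ranks over tied |d|
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)

# Number of distinct pairs that can be formed from 16 stimuli.
print(comb(16, 2))

# Illustrative correct-identification counts for two stimuli under eight conditions.
a = [58, 61, 70, 74, 80, 83, 90, 95]
b = [45, 63, 66, 74, 71, 80, 85, 96]
print(wilcoxon_t(a, b))
```

The resulting T is then compared with a tabled critical value for the number of non-zero differences; a small T indicates that one stimulus is consistently the more intelligible of the pair.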
TABLE III

WILCOXON T VALUES FOR THE DIFFERENCES IN INTELLIGIBILITY OF PAIRS OF VOWELS OR DIPHTHONGS

In each row of the table, one can see which vowels or diphthongs are significantly less
intelligible than the vowel or diphthong which heads that row. The columns of the table reveal
which vowels or diphthongs are significantly more intelligible than the vowel or diphthong which
heads the column. For the vowel [o], for example, the vowels and diphthongs [æ], [i], [ɝ],
[ʊ], [u], and [ɪu] are significantly less intelligible, and the vowels [ɛ], [ɪ], [ɑ], [e],
[aɪ], and [ʌ] are significantly more intelligible. It is interesting to note that 84 of the 120
pairs of stimuli are significantly different in intelligibility. Two conclusions seem warranted.
First, as previously confirmed, the vowels and diphthongs, as a group of stimuli, are significantly
different in intelligibility. Secondly, this significant difference is not the result of one or two
quite unintelligible vowels among a majority of vowels about equally intelligible. Rather, there
are significant differences between most pairs of vowels or diphthongs. This implies that there is
a fairly stable order of intelligibility of the vowels and diphthongs.

2.  Is  the  rank  order  of  vowel  and  diphthong  intelligibility  similar  in  quiet  and  in  noise? 

The  fact  that  vowels  and  diphthongs  differ  in  intelligibility  having  been  determined,  data 
were  analyzed  to  determine  if  the  rank  order  of  intelligibility  was  similar  in  noise  and  in  quiet  at 
each  of  the  levels  previously  described.  Essentially,  do  the  correlations  between  noise  and  quiet 
rankings  increase  or  decrease  as  the  stimulus  becomes  clearer?  The  intelligibilities  of  the  vowels 
and  diphthongs  in  noise  and  quiet  were  ranked  for  each  stimulus  level  and  rho  coefficients  were 
computed. For levels 1 through 5 consecutively, the obtained values were: .90, .93, .93, .72,
and .71. Two observations can be made from these findings. First, the rank orders of
vowel-diphthong intelligibility in quiet and in noise are significantly related. Second, it appears that
as the stimuli (vowel and diphthong) increase in clarity, the relationship between rank orders
declines. The significances of the differences between these correlation coefficients do not reach
the .05 level but do approach it.
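The rank-order (rho) coefficient used here can be sketched as follows. The counts are hypothetical and cover only six stimuli; the function names are illustrative. The coefficient is computed as a product-moment correlation on the ranks, with tied values given average ranks, which is one standard way of obtaining rho and is assumed rather than taken from the report.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) to lowest, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank-order correlation, computed as Pearson r on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical counts of correct identifications for six stimuli at one level.
quiet = [70, 64, 55, 41, 30, 12]
noise = [66, 68, 50, 44, 25, 15]
print(round(spearman_rho(quiet, noise), 2))
```

In the hypothetical data only the first two stimuli exchange ranks between quiet and noise, so the coefficient is high; the same computation applied at each of the five levels would give the five values listed above.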

3. Does the relative intelligibility of the vowels and diphthongs remain the same as the
intensity of the stimuli is successively increased?


The stability of the rank order of vowel-diphthong intelligibility was investigated between
the five levels involving successive increases in signal-to-noise ratios or signal intensity. As the
clarity of the stimulus increases, the vowels and diphthongs, of course, become more intelligible.
However, it is of interest to know if the rate of increase in intelligibility is similar. Do some
vowels or diphthongs increase in intelligibility at a more rapid rate than others? The means of
answering this question was to compute correlation coefficients between the various levels of
stimulus presentation. If the rate of increase in intelligibility was similar for these vowels and
diphthongs, the correlations between the levels should be uniformly high.

The data in quiet and in noise were combined; previous results revealed that the rank
orders in quiet and in noise were quite similar. The numbers of correct identifications of the
vowels and diphthongs for each level were correlated with every other level. Rho coefficients
were obtained and are presented in Table IV.

Several observations can be made from an examination of Table IV. First, the
intelligibilities of the vowels and diphthongs do not improve at a uniform rate as the listening conditions
become easier. Some stimuli, for example the diphthong [ɔɪ] (see Table I), are found to have
substantially improved relative intelligibility as listening becomes easier, while others such as
[d] become relatively less intelligible. (Decreases in the relative intelligibility of certain
stimuli must accompany increases for other items; however, this does not imply that the
percentage of correct identifications of any stimulus actually decreases under improved listening
conditions.) Second, the greatest relationship between orders of vowel-diphthong intelligibility
is found among adjacent levels of stimulus clarity. The further removed the levels, the less
the relationship. These two observations suggest the conclusion that, in quiet, noise, or quiet
and noise combined, the order of vowel-diphthong intelligibility at one level of listening
difficulty would not be a good predictor of the order at a very different level of listening
difficulty.

                                 TABLE IV

        CORRELATIONS OF VOWEL AND DIPHTHONG INTELLIGIBILITY VALUES
                 BETWEEN FIVE LEVELS OF STIMULUS CLARITY

LEVELS        1            2            3            4            5

  1                       .86          .79          .67          .36
  2                                [illegible]      .90          .64
  3                                                 .92          .74
  4                                                              .86
  5


4.  Is  the  rank  order  of  vowel-diphthong  intelligibility  consistent  among  speakers? 

The data were analyzed with respect to speaker effect on the relative intelligibility of
vowels and diphthongs. Are there differences in the rank order of vowel-diphthong intelligibility
as spoken by different speakers? Eight speakers were used to present the stimuli; the rank order
of intelligibility was determined for each speaker in quiet and in noise. An average intercorrelation
of ranks was computed for the eight rankings in quiet; a similar computation was made
for the rankings in noise. The obtained average correlations were .49 and .46 respectively, which
are significant at the .01 level. These values reveal a significant relationship among different
speakers with respect to the order of vowel-diphthong intelligibility.
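
One way to obtain an average intercorrelation of ranks is sketched below. The report does not state how
the average was formed, so taking the mean of all pairwise rho values is an assumption, as are the
function names and the three short sample rankings. The shortcut formula used here requires rankings
without ties.

```python
from itertools import combinations

def rho_from_ranks(r1, r2):
    """Spearman rho for two untied rankings: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(r1)
    d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def average_intercorrelation(rankings):
    """Mean rho over every pair of rankings (assumed definition)."""
    pairs = list(combinations(rankings, 2))
    return sum(rho_from_ranks(a, b) for a, b in pairs) / len(pairs)

# Hypothetical rankings of four stimuli by three speakers.
rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
avg_rho = average_intercorrelation(rankings)
```

With eight speakers and 16 stimuli the same function would average 28 pairwise coefficients; applied to
the four male or four female rankings alone it would average six.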

5. Do male and female speakers produce similar intelligibility values for the vowels and
diphthongs?

The speakers included four males and four females. This permitted a study of the effects
of sex on vowel-diphthong intelligibility. As in the previous procedure, the intelligibility
values of the male and female speakers were ranked and analyzed separately. The average
intercorrelations of the rankings of the male speakers were .62 in quiet and .42 in noise.
For female speakers, the corresponding values were .64 and .61. These generally larger values
for males and females separately, when compared with the combined values of .49 and .46 shown
above, suggest that speakers of the same sex produce the vowels and diphthongs more alike
with respect to intelligibility than speakers of different sexes.

Additional differences were found between male and female speakers. Male speakers
appear to be more intelligible in vowel and diphthong production, and particularly more
intelligible in noise. The intelligibility values of the male and female speakers were summed
separately for each level of stimulus presentation; this was done for quiet, for noise, and for
these conditions combined. For each of these conditions, the male and female speakers provided
two values for each of the 16 vowels and diphthongs. The Wilcoxon Test for Matched
Pairs was computed for each of these sets of paired values. The results are presented in Table V.

                                  TABLE V

RESULTS OF WILCOXON TEST OF THE DIFFERENCE IN INTELLIGIBILITY VALUES BETWEEN
            MALE AND FEMALE SPEAKERS UNDER VARIOUS CONDITIONS

                                 CONDITION

LEVEL             QUIET             NOISE             QUIET AND NOISE

Level 1           T = 45.5          T = 12.5**        z = 2.47*
Level 2           T = 53.3          T = 25.5*         z = 2.03*
Level 3           T = 38.5          T = 7.5**         z = 3.00**
Level 4           T = 17**          T = 6.5**         z = 4.15**
Level 5           T = 31.5          T = 8**           z = 3.69**
All Levels        T = 42.0          T = 3**           z = 3.44**

 *Significant at .05 level
**Significant at .01 level
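
The matched-pairs computation behind a table of this kind can be sketched as follows. This is a minimal
illustration, not the authors' procedure: the function name, the treatment of zero differences (dropped)
and tied ranks (averaged), and the six sample pairs are assumptions. T is taken as the smaller of the
two signed-rank sums, and z is the usual large-sample normal approximation, which is presumably the
source of the z values reported for the combined condition.

```python
import math

def wilcoxon_matched_pairs(x, y):
    """Return (T, n, z) for paired samples; n counts the non-zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    # Rank the absolute differences, giving ties the mean of their ranks.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t = min(t_plus, t_minus)
    # Normal approximation: mean and standard deviation of T under the null hypothesis.
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t - mean_t) / sd_t
    return t, n, z

# Hypothetical summed intelligibility values for six stimuli.
male = [60, 55, 70, 48, 66, 52]
female = [58, 57, 61, 40, 59, 52]
t, n, z = wilcoxon_matched_pairs(male, female)
```

Because T is the smaller rank sum, z comes out negative here; only its magnitude is compared with the
critical value. For small n, T is referred to a table of critical values rather than to z.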


In all the conditions of presentation, the male speakers had, in varying amounts, superior
intelligibility. However, it is only under conditions of noise accompanying the stimulus that the
male speakers are consistently and significantly more intelligible. These findings do not appear
to be due to differences in quality or to differences in loudness between male and female speakers.
The methodology of the study included (1) an attempt to equate the male and female speakers for
quality, and (2) an actual equating of all voices for loudness. Therefore, it appears that males
communicate vowels and diphthongs more intelligibly than do females. The relative superiority
of the male speakers in noise is even more strongly supported by the data.

Finally, it was noted (see Table I) that as a group male speakers are more intelligible in
noise than in quiet, and female speakers are more intelligible in quiet than in noise. Though
the conditions of quiet and noise were equated for difficulty for all speakers combined at each
of five levels, male speakers are consistently superior in noise in comparison with quiet; female
speakers are consistently superior in quiet in comparison with noise. These superiorities are
significant at the .01 confidence level. Given a choice between speech transmission through
noise and an equally difficult condition of low signal intensity, the male voice probably will be
more intelligible with the former and the female voice will be more intelligible with the latter
condition. This difference in the conditions which favor the male and female voice may be restricted
to low-frequency stimuli such as vowels.


6. What are the principal confusions among the vowels and diphthongs under conditions
of quiet and noise?


The substitutions of other vowels or diphthongs for the stimulus presented were tabulated
for each of the 16 vowels and diphthongs. The tabulations for the conditions of quiet and noise
were made separately. These are presented in Table VI.

The determination of principal confusions was handled by first hypothesizing that substitutions
of one vowel for another could reasonably be explained as guessing behavior. If this null
hypothesis can be rejected, then other explanations would need to be postulated. The most likely
of these explanations was that two given vowels truly are confused.

If nothing but guessing behavior is assumed, then each of the 15 possible errors for a particular
stimulus is equally probable. Thus, of the total number of errors made for a given stimulus, each type of
error should occur 6.67 per cent of the time except for sampling fluctuations. For example, there
were 258 errors made for the vowel [ɛ] presented in quiet (see Table VI). Each of the 15 types
of errors should occur as a pure guess 17.2 times, or 6.67 per cent of 258, except for deviations
due to sampling.

The standard error of this expected 6.67 per cent for each type of error was computed.
From this value, the .01 limits of the percentages one might obtain in samples of size 258 were
found. The upper limit, at the .01 level, was 10.67 per cent. Expressed as a number of errors,
this would be 27.5. Error frequencies for the vowel [ɛ] in quiet which exceed 27.5 cannot be
explained adequately as guessing behavior. In this case, the vowels [i], [o], and [æ] show
significant departures from expected frequencies; that is, they are significant confusions for [ɛ].
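
The arithmetic of this guessing criterion can be sketched as below. The function name and the multiplier
of 2.576 (the two-sided .01 normal deviate) are assumptions; the report does not name the multiplier,
but this value reproduces the 10.67 per cent and 27.5 figures quoted above.

```python
import math

def guessing_upper_limit(total_errors, n_alternatives=15, z=2.576):
    """Upper .01 limit for one error type if errors were pure guesses.

    Returns the limit as a proportion and as a number of errors.
    """
    p = 1 / n_alternatives                         # expected share under guessing
    se = math.sqrt(p * (1 - p) / total_errors)     # standard error of a proportion
    upper_p = p + z * se
    return upper_p, upper_p * total_errors

# 258 errors were made for the example vowel in quiet.
upper_p, upper_count = guessing_upper_limit(258)
```

An observed error frequency above `upper_count` would be treated as a significant confusion. The same
function applied to a stimulus with 702 errors gives a limit between 63 and 64 errors.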

The asterisks in Table VI indicate that there is at least one significant departure from the
expected frequency of error (or confusion) for every stimulus presented to listeners. A total of 99
such significant error frequencies were found, indicating that guessing and sampling are not adequate
explanations of these high frequencies. This led to the conclusion that a number of vowels
and diphthongs are truly confused with each other.


Finally, it needs to be noted that the design of this study and the analysis described above
provide an identification of the principal confusions, though not necessarily all confusions. This
occurs because each potential confusion is not independent of the others. For example, in the last
row of Table VI, the vowels [i] and [u] are the principal confusions for [ɪu]. These two confusions
account for 549 of a total of 702 errors. The remaining 153 errors are distributed over the
other 13 possible substitution errors. Any of these 13 possible errors would need to occur more
than 63 times to be classified as a confusion. Thus, the principal confusions tend to mask other
potential confusions; if the principal confusions were removed, others might well be exposed.


7. Do related pairs of errors produce similar confusability values?

When errors for vowels and diphthongs are examined, it can be noted that each error is one
of a pair. For example, the vowel [ɑ] can be an error for [ɛ], and [ɛ] can be an error for [ɑ].

Analysis of the data (Table VI) supports the conclusion that paired errors produce somewhat
similar confusability values. This was studied by means of rank order correlations. For example,
the [ɑ] was an error for the vowel [ɛ] in 10 instances, and the vowel [ɛ] was an error for the
vowel [ɑ] 12 times; similarly the vowel [i] was an error for the vowel [e] 63 times, and the
vowel [e] was an error for [i] 84 times. Fifteen such pairs of error frequencies were associated
with each vowel or diphthong. For each of the 16 vowels or diphthongs, a rank order correlation
was computed. The coefficients yielded are presented in Table VII.

                          TABLE VII

  RELATIONSHIPS BETWEEN RELATED CONFUSIONS FOR EACH OF 16
                    VOWELS OR DIPHTHONGS

Vowel or Diphthong        Rank Order Correlation

        ɛ                          .79
        ɪ                          .72
        ɑ                          .90
        e                          .73
        aɪ                         .55
        ʌ                          .67
        aʊ                         .52
        ɔ                          .70
        o                          .25
        ɔɪ                         .70
        æ                          .69
        i                          .50
        ɝ                          .68
        u                          .74
        ʊ                          .65
        ɪu                         .76
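
The per-stimulus correlation of paired errors can be sketched as below. This is an illustration under
stated assumptions: the nested-dictionary layout `confusions[stimulus][response]`, the function names,
and the four-stimulus sample matrix (labelled A to D) are hypothetical, and ties are given averaged ranks.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the two rank lists (undefined if either is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def reciprocal_rho(confusions, stimulus):
    """Correlate 'other given for stimulus' with 'stimulus given for other'."""
    others = [s for s in confusions if s != stimulus]
    given_for_stimulus = [confusions[stimulus][o] for o in others]
    stimulus_given_for = [confusions[o][stimulus] for o in others]
    return spearman_rho(given_for_stimulus, stimulus_given_for)

# Hypothetical error counts: confusions[presented][reported].
confusions = {
    "A": {"B": 30, "C": 10, "D": 2},
    "B": {"A": 25, "C": 8, "D": 4},
    "C": {"A": 5, "B": 6, "D": 20},
    "D": {"A": 1, "B": 5, "C": 18},
}
rhos = {s: reciprocal_rho(confusions, s) for s in confusions}
```

With the full 16 by 16 matrix each coefficient would rest on 15 pairs of error frequencies, one pair
for every other vowel or diphthong.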

Additional information was gained on the nature of vowel confusions by plotting each
vowel at the intersection of the first two formants using the data obtained by Peterson and
Barney.¹² The diphthongs, identified by broken lines to represent movement from one vowel
position to another, were drawn somewhat out of position to permit the construction of confusion
vectors. The intelligibility, the principal confusions (A, B, C), and the totals were
then entered, resulting in Figure 2 and Figure 3.

The confusions observed in both quiet and noise follow the predominantly horizontal
pattern that would be predicted from formant analysis, and are strikingly similar to the results
Pickett found with vowels in flat and high-frequency noises and to the results of Miller's
study in which a low-pass filter was used to deliberately remove the higher frequencies. In
both of the latter studies the vowels were presented in a consonantal context.

Miller states: "Most of these (confusion) lines run horizontally, which is what we would
expect if the vowels were projected onto the ordinate as a result of removing all information
about their position on the abscissa. However, there are minor deviations from this rule: confusions
between had and hud ([æ] and [ʌ]); between hud and hawed ([ʌ] and [ɔ]); and
between head and hawed ([ɛ] and [ɔ]) should have occurred but did not." He concludes:
"The simplest explanation of these discrepancies is that hid, hood, head, and hud contained
short vowels, whereas heed, who'd, had, hod, and hawed contain long vowels." While
there are several differences between the study by Miller and the present study, one difference
is that the vowels in the present study were sustained and therefore had similar temporal
characteristics. The point of interest is that, in the present study, the confusions that Miller
reported should have occurred were found, and these can be observed in Figure 2
and Figure 3.

[Figure: Principal confusions among the vowels and diphthongs in quiet]

V  CONCLUSIONS 


The data obtained relative to the principal questions raised in this study provide the
bases for a series of conclusions; these are summarized below in the sequence of those questions
listed in Section III.


1a. The intelligibility values of the vowels and diphthongs are very significantly different
    at low intensities and relatively difficult signal-to-noise ratios.

1b. The differences in vowel and diphthong intelligibility values reveal a fairly stable
    order of intelligibility. This order, from most to least intelligible, is [ɛ], [ɪ],
    [ɑ], [e], [aɪ], [ʌ], [aʊ], [ɔ], [o], [ɔɪ], [æ], [i], [ɝ], [u], [ʊ],
    and [ɪu].

2.  The orders of vowel and diphthong intelligibility under conditions of quiet and noise
    are highly related. The relationship is highest at the lowest levels of signal clarity.

3a. As signal clarity improves, some vowels and diphthongs increase in intelligibility at
    a more rapid rate than others. Similarly, the converse is probably true; some vowels
    and diphthongs decrease in intelligibility more rapidly than others as the signal clarity
    declines. Therefore, the order of vowel-diphthong intelligibility at one level of
    listening difficulty would not be a good predictor of the order under a very different
    level of listening difficulty.

3b. Conclusions (2) and (3a) suggest that changes in level of signal clarity affect the
    order of vowel and diphthong intelligibility more than a change from quiet to noise
    or noise to quiet.

4.  The rank orders of vowel-diphthong intelligibility among different speakers under
    similar conditions are significantly related, though not sufficiently for accurate
    prediction. Average correlations for quiet and noise conditions were .49 and .46.

5a. Speakers of the same sex are more likely to have similar orders of vowel-diphthong
    intelligibility than speakers of different sexes.

5b. Differences between male and female speakers in vowel-diphthong intelligibility
    were greater in noise than in quiet; the male speakers were superior in both conditions.

5c. Female speakers were significantly more intelligible in quiet than in noise; male
    speakers were significantly more intelligible in noise than in quiet.

6a. For each of the 16 vowels and diphthongs, at least one of the 15 other vowels or
    diphthongs is a significant confusion.

6b. Approximately [illegible] per cent of all possible confusions, in this closed matrix of vowels
    and diphthongs, were found to be significant confusions. If these principal confusions
    are masking others, the number would be greater still.

6c. The vowel [i] is the most frequently occurring confusion for another vowel or diphthong
    in both quiet and noise. Furthermore, the vowel [i] most frequently results
    in omission responses. It is interesting to note that in a study of the English digits,
    the digit THREE, containing the vowel [i], is the least intelligible, and the most
    frequently confused with other digits.

7.  Pairs of confusions are significantly related; for example, the relative frequency of
    [ɑ] as a substitute for [ɛ] is highly related to the relative frequency of [ɛ] as a
    substitute for [ɑ]. Confusions appear to bear a reciprocal relationship to one
    another.